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Abstract 

Detector response to a high-energy physics process is often estimated by Monte Carlo simulation. For purposes of data 
analysis, the results of this simulation are typically stored in large multi-dimensional histograms, which can quickly 
become both too large to easily store and manipulate and numerically problematic due to unfilled bins or interpolation 
artifacts. We describe here an application of the penalized spline technique [Ij to efficiently compute B-spline represen- 
^ tations of such tables and discuss aspects of the resulting B-spline fits that simplify many common tasks in handling 
tabulated Monte Carlo data in high-energy physics analysis, in particular their use in maximum-likelihood fitting. 

'O' Keywords: Splines, Monte Carlo, Histograms, Maximum Likelihood 
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1. Introduction 

Monte Carlo simulation is a common technique for 
estimating detector response in high-energy physics and 
tabulated forms of Monte Carlo simulation are commonly 
used in data analysis as probability density functions in 
maximum-likelihood fits. As the number of tabulated di- 
mensions increases, numerical problems begin to occur: 
each bin of the histogram becomes less well filled, statis- 
tical fluctuations become larger, zero bins appear. Use of 
such functions as probability densities with numerical min- 
imization algorithms in maximum likelihood fits can then 
cause substantial problems with convergence and makes 
parameter extraction using this approach difficult. The 
use of penalized spline fits [1] in place of raw histograms, 
however, allows the resolution of most of these problems 
without resorting to extremely coarse binning and intro- 
duces additional capabilities, such as analytic convolution, 
derivatives, and integrals that can improve fit quality even 
beyond the removal of numerical instabilities. 

The application described here arose in the context of 
the IceCube detector [51 [3] , which uses tabulated Monte 
Carlo data, among other purposes, for describing photon 
propagation in glacial ice for simulation of neutrino events, 
energy reconstruction of muons ^ , and for direction, po- 
sition, and energy reconstruction of electromagnetic and 
hadronic showers produced by Vf. and neutral current neu- 
trino interactions [5]. The heterogeneity of the glacial ice 
[5] results in a complicated dependence of light propaga- 
tion on direction, source depth, and receiver depth, result- 
ing in 6-dimcnsional tables required to describe the light 
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response of the detector to arbitrarily oriented and posi- 
tioned showers. Similar problems arise in parametrizing 
detector response for purposes of event reconstruction in 
many high-energy detectors, either because of variations 
in the detector medium (as with IceCube) or because of 
varied instrumentation technology (as in many collider de- 
tectors) . 

A raw table-based approach caused difficulties, espe- 
cially for directional reconstruction of showers: the tables 
can require immense amounts of memory (terabytes) and 
tabulation can introduce numerical artifacts that resulted 
in seams in the resulting likelihood space and prevented 
minimizer convergence. These numerical artifacts arise 
from averaging over a table cell during binning, from in- 
accurate interpolation between cells, and from statistical 
fluctuations as the bins are filled. Any attempt to reduce 
the size of the tables exacerbated the binning artifacts, and 
vice versa. An interpolating function was the only viable 
solution to these problems. 

An ideal interpolating function for these types of his- 
tograms would have several properties: it would be smooth, 
fast to evaluate, possible to fit deterministically in a small 
amount of time, extend easily to large numbers of dimen- 
sions, and have parameters that are easy to store, while 
minimizing introduction of bias and artifacts into the fit 
surfaces. Basis splines (B-splines) fit this description well: 
a spline of order n has n — \ continuous derivatives, the 
functions can be evaluated quickly, and fitting them to a 
histogram is a linear problem, so it can be done quickly 
with standard techniques. The underlying linear problem 
is inherently sparse, and can be computed efficiently even 
for very large histograms using Generalized Linear Array 
Models (CLAM [7]). The linear formulation can also be 
extended easily to include a regularization term [T] that 
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penalizes erratic sweeps in the fit surface in the absence of 
strong evidence from tlie data. 

In addition to the general properties listed above, our 
problem had a number of special requirements that we 
were able to satisfy through a combination of extensions 
to the GLAM/penalized-spline fitting technique and op- 
timized routines for the evaluation of tensor-product B- 
spline surfaces. First, our photon propagation tables were 
large enough (hundreds of gigabytes) to make it impracti- 
cal to load the entire 6-dimensional histogram into memory 
at once, required to perform any fit, so we implemented a 
method of "stacking" spline surfaces fit to 4-dimensional 
slices of the full histogram to form a full 6-dimensional in- 
terpolating spline surface. Second, we needed to assume 
the same tables for both simulation and reconstruction of 
events, which have somewhat different required informa- 
tion. In order to simulate events, we needed to be able to 
sample from the distribution of photon detection times at 
fixed points in space using its time-differential form, while 
for maximum-likelihood reconstruction we needed to be 
able to integrate the distribution of photon detection times 
over arbitrary intervals as well as efficiently evaluate the 
gradients of those integrals in all 6 dimensions. We accom- 
plished all three with a single spline surface by fitting the 
cumulative distribution of photon arrival times, using the 
technique of non-negative least-squares to enforce mono- 
tonicity, and using a basis of B-splines and their derivatives 
to simultaneously evaluate both the value and gradient 
of the resulting surface at little additional computational 
cost. Third, in the maximum-likelihood reconstruction we 
needed to be able to account for a family of detector ef- 
fects and event hypothesis uncertainties that can be ap- 
proximated by a convolution of the photon detection time 
distribution with Gaussians of various widths. In order to 
effect such convolutions without repeating the fit once for 
each kernel, we implemented a method for quickly com- 
puting the coefficients of the convolution of the fit surface 
with an arbitrary B-spline kernel |8j. 

We have implemented the spline-fitting techniques men- 
tioned above in a library written in C using sparse matrix 
operations provided by the SuiteSparse [Hj and GotoBLAS 
2 [10] libraries, as well as a lightweight library of functions 
for evaluating the resulting surfaces (code available in the 
CPC program library). Here we will describe the under- 
lying mathematics of these methods as well as the details 
of our implementation. 

2. B-spline surfaces 

To create a smooth approximating function in one di- 
mension, we take a linear combination of B-spline func- 
tions defined on a common knot vector fc, such that the 
nth B-spline of order m, B^, is defined on the subset of k 
from kn to fc„+,„+i. 

Functions defined in this way have several attractive 
properties. Because they are linear combinations of smooth 




Figure 1: A 2-dimensional tensor product spline constructed from 
two one-dimensional basis splines. 

functions, they are likewise smooth, the amount of infor- 
mation required to represent the function is small, and 
they can be fit to data using standard linear techniques 
such as Singular Value, QR, or Cholesky decomposition. 
In addition, because all but m -f 1 of the basis functions 
are identically zero at any particular point (they have local 
support), to evaluate the approximation function we need 
only evaluate the de Boor algorithm [TT] m + 1 times. This 
vastly improves computation time for lookups compared to 
functional representations like Fourier decompositions or 
polynomial interpolation that require evaluations of basis 
functions at every point. 

We can extend this approach easily into multiple di- 
mensions by adopting the concept of the tensor product 
spline. Instead of a knot vector, the tensor product ba- 
sis functions are defined on a rectangular n-dimensional 
knot grid k formed by taking the tensor product of n one- 
dimensional knot vectors. Each basis function is then a 
product of one-dimensional B-splines (Figure [T]), which are 
taken in linear combination as in the one-dimensional case: 

B{x) ^a-B^a- [b\{xi) ® B^^{x2) ® ■ ■ (1) 

Because most components of the basis tensor B are 
zero, as in the one-dimensional case, we recover the same 
local support property, albeit with more contributing ba- 
sis functions due to the increased dimensionality of the 
problem. 

3. Determining B-spline coefficients 

3.1. Ordinary least squares 

The coefficients of the B-spline basis functions can be 
found by the usual least-squares formulation: 

B'^Bx ^ B''y (2) 

where x is the vector of B-spline coefficients, y the data 
points in the table, and B a matrix with one column for 
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each basis spline; the rows of B are formed by evaluating 
the spline at each data point in y. The value of x can be 
found by standard algebraic techniques such as Cholesky 
decomposition, but the possibility for overfitting and ring- 
ing can cause substantial difhculties when using the result- 
ing fits. 

3.2. Penalized Least Squares 

Ringing in the fit surface can be reduced substantially 
by giving the curve some intrinsic rigidity to follow the 
general trend instead of following the fluctuations. Using 
Tikonoff rcgularization with splines [T] allows us to ac- 
complish this. This is implemented in the context of least 
squares fitting by adding a term to the equations. Instead 
of solving Equation [2j we solve the penalized equation: 

{B''B + XP''P)a^B''y (3) 

The solution to this equation now minimizes — 
-I- A||Pq;||. a is then the strength of the rcgularization 
(called the smoothing parameter hereafter). For A = we 
recover the unpenalized normal equations ([2| . For A — >■ cx) , 
the rcgularization term dominates, and we end up with a 
curve of the shape specified by the rcgularization term, 
which is a form of Bayesian prior (for order-2 Tikonoff 
rcgularization, this will be a straight line). Details of the 
form of this matrix can be found in [1] . 

3.3. Penalized Least Squares in Multiple Dimensions 

In principle, the extension of least squares fitting to 
multiple dimensions is easy. As before, we have a set of 
(n-dimensional) basis functions B, to be fit at a set of n- 
dimensional positions to a measured set of data y. With 
some smoothness constraint on each dimension, we then 
want to minimize 

\\Ba-y\\ + Xi\\Pia\\ + AaHPaaH + • • • + A„||P„a|| (4) 

In analogy to Equation |3j we can write down the cor- 
responding system of linear equations: 

(^B^B + J2 ^^PIP^^ « = (5) 

The difficulty here arises that B can be infeasibly large 
in the multi-dimensional case. For 25 knots on each of 4 
axes, and a table with 200 million cells (typical numbers 
for a single IceCube photon propagation table), B will 
have dimension 390625 by 200 million and require 568 TB 
of RAM. 

There are two important things to notice: the first is 
that B is quite sparse. Because of the local support prop- 
erty of B-splines, most of the basis functions have no sup- 
port at most of the data points. As such, we can eliminate 
all of the zeroes from the matrix using sparse-matrix rou- 
tines. This reduces memory consumption in the above 
problem by a factor of 100. 



The second thing is that B is not present in isolation 
in Equation [5j but instead only occurs as the combina- 
tion B'^B. Whereas B has dimensions of the number of 
basis functions by the number of data points, B'^ B is a 
square matrix of side length the number of basis func- 
tions. For the example problem above, being able to work 
directly with B"^ B without the need to ever explicitly form 
B would reduce memory consumption by another factor of 
500. This also means, critically, that neither the memory 
requirements nor the CPU requirements of this fitting al- 
gorithm would depend on the number of data points, but 
instead only on the number of basis functions. Working 
directly with B'^ B can be accomplished using GLAM [7j. 

3.4- Table Stacking and Pseudointerpolation 

For each run of the IceCube photon simulation [T^] we 
intend to fit, we produce 4-dimensional light distributions 
for a single source configuration. For a table-based simu- 
lation, many source configurations are simulated and the 
distributions for an arbitrary source produced from inter- 
polation between the tables for similar sources. For spline 
tables, we wish to be able to do this as well. The easi- 
est solution, to do a global fit, is impracticable even using 
GLAM [7]. In addition, because there is no bin averag- 
ing between sources, smoothing (and thus least-squares 
fitting) is unnecessary, and we can stick to interpolation 
algorithms. 

What we do then is to "stack" the individual 4-dimensional 
tensor-product spline surfaces into a single 6-dimcnsional 
tensor-product spline surface. The normalization of the 
spline basis functions is such that we can simply stack the 
fit coefficients of each sub-table while creating a knot vec- 
tor such that the original values of the sub-tables form 
control points for the interpolating splines in the extra di- 
mensions. 

Given a knot vector k, the maxima of the B-splines de- 
fined on that vector can be found by solving the equation 
for the first derivatives of each spline [TTl Chapter X, Equa- 
tion 12]. Given knots spaced at equal intervals Ak, these 
also are the centers of each basis function and can be found 
at: 

— ' -t Ti — 1 -* 

BT"" = k -^Ak (6) 

By inverting Equation |6j we can acquire a formula for 
a knot vector given a vector of centers x: 

n — 1 

k^x-\ (7) 

We must also add points to the knot vector, since a 
spline function of order n made of m basis functions (and 
so m coefficients) must be defined by m+n+1 knots. Thus 
we need n + 1 extra knots, of which the first n are placed 
at linearly extrapolated positions before the positions from 
Equation [Tj and one placed at a linearly extrapolated po- 
sition after the last element. 
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Sampled Points 
Spline Pseudointerpoiation 




Figure 2: Pseudointerpoiation of a set of points sampled from sin(a:) 
using the same algorithm as for table stacking. Since B-splines do 
not interpolate their control points, but instead maximally fill the 
convex hull formed by those points, the interpolation algorithm will 
fall slightly short of extrema, while generally maintaining accuracy. 



Iterative application of Equation [7] with stacking of co- 
efficients will then produce a new tensor product spline 
surface with the sub-table locations as control points. It 
is worth noting that this surface does not exactly inter- 
polate the original tables, as B-splines do not interpolate 
their control points. Instead, it forms a curve that maxi- 
mally fills the convex hull of the set of control points (Fig- 
ure [2|. In general, this is an excellent approximation to 
interpolation, but sharp extrema will be systematically un- 
derestimated by a small amount. 

4. Cumulative B-spline surfaces 

Using a single B-spline surface to represent the distri- 
bution of photon arrival times for both simulation and re- 
construction purposes posed a special challenge. To be 
able to efficiently evaluate the integral of the distribu- 
tion over arbitrary time windows, we fit the cumulative 
distribution to a tensor-product B-splinc surface, using a 
modified version of the penalizcd-spline formulation [T| to 
enforce the monotonicity of the surface. In order to re- 
cover the differential distribution and other elements of 
the gradient from the cumulative form, we implemented 
an efficient method of simultaneously evaluating the value 
and gradient of a tensor-product spline surface. Lastly, 
we implemented a method of quickly - and analytically - 
convolving a spline surface with a spline kernel in order 
to approximate detector effects such as PMT transit-time 
spread during reconstruction. 

4.I. Enforcing monotonicity 

While the B-spline surface determined by the least- 
squares condition is by definition the closest fit to the 
data, it does not necessarily share some desirable prop- 
erties with the data. In particular, when the data can't be 



exactly represented as a spline surface, the fit surface may 
ring (Figure [3|. This can cause the fit surface to become 
negative in a region where the data are strictly positive 
or non-monotonic where the data are strictly monotonic. 
Although this behavior can be substantially reduced by 
regularization, it is critical that at least monotonicity be 
exactly obeyed for our application, where we fit for the cu- 
mulative photon arrival time distribution, which must by 
definition be everywhere monotonic. It is possible, how- 
ever, to express the monotonicity constraint along a single 
dimension (time, in our case) without substantial modifi- 
cation to the usual least-squares fitting procedure. 




0.0 0.2 0.4 0.6 0.8 1.0 



Figure 3: An example of the "ringing" that can occur as an artifact of 
least-squares fitting with B-splines. The sharp transitions are unrep- 
resentable in the B-spline basis, so the spline surface must fluctuate 
around the data in order to minimize the residual. This can cause 
the spline surface to become negative where the data are strictly 
positive and non-monotonic where the data are strictly monotonic. 

We implemented the method suggested by [13j , in which 
we transform one axis of the tensor-product B-spline basis 
to a cumulative T-spline ("Trapezoidal" spline) basis in 
which each basis function is the sum of the B-spline ba- 
sis functions following it. The equation for the coefficients 
of the penalized B-spline surface ([s]) can be converted to 
the T-spline basis by multiplying the basis and penalty 
matrices from the right with an upper-triangular matrix: 



B = BU = B 



1 



1\ 



(8) 



\0 ••• 1/ 

Unlike the B-splines from which they are derived, the 
T-spline basis function have highly non-local support; e.g. 
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the first T-spline basis function has support over the en- 
tire knot field. This makes evaluating the spline surface 
arduous; in the extreme case of a point in the support 
of the last T-spline basis function, evaluating the spline 
surface involves all the preceding coefficients. Luckily, we 
can use the T-spline basis for fitting only, transforming 
the T-spline coefhcients into the corresponding B-spline 
coefficients before storing them. 

The T-spline basis, however, makes it particularly easy 
to enforce monotonicity in one dimension. In this basis, 
the spline surface is monotonic if and only if all the T- 
spline coefficients are positive. While not quite as simple 
as the unconstrained problem, there exist algorithms to 
solve the non-negative least-squares problem as a series of 
unconstrained problems in a finite number of steps. 

Given A e M*^""^ and & S E^, x e M*^ is the solution 
to the non-negative least squares problem: 



min ll^x - 6II2 : a; > 



(9) 



The Karush-Kuhn- Tucker optimality conditions lead to 
the following linear complementarity problem [14j : 



y = A'^Ax - A^b : y > 0, x > 0, 



T 

X y 



(10) 



An active set approach is commonly used to find the 



solution to (10 1. The set of coefficients x is partitioned 



into a free set F and a constrained set Q (often called 
the passive and active sets, respectively). The coefficients 
in the free set are the solution to the unconstrained least 
squares problem in the subspace of the free set, while those 
in the constrained set are clamped to zero. The problem 
then reduces to quickly and efficiently finding the partition 



of the coefficients that satisfies (10 1. The solution can be 



found iteratively, but unlike the minimization of a general 
function, it is possible to construct algorithms that are 
provably finite. 

The best known of these is the Lawson-Hanson NNLS 
algorithm [TS] . For very large problems, NNLS can be slow 
to converge, so we additionally implemented and examined 
two block-pivoting algorithms, the Portugal/ Judice/Vincente 
(PJV) algorithm [H] and Adlers BLOCKS 16 , aU using 
sparse matrix operations. Both provide substantial speed 
improvement relative to NNLS, but, due to occasional cy- 
cling behavior in the PJV algorithm, we recommend use 
of BLOCKS for all applications. BLOCKS provides rapid 
convergence even for very large problems (Figure |4j [5]) and 
converges deterministically in all circumstances. 

4-2. Gradients of spline surfaces 

Even though we fit the cumulative distribution of pho- 
ton arrival times, we still need the probability density 
functions for sampling as well as the other dimensions 
of the gradient for accelerating minimizer convergence in 
maximum-likelihood fitting. Using the de Boor recursion 
relation [11, Chapter X] and the multiplication rule, we 
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Figure 4: Number of iterations and complete Cholesky factoriza- 
tions needed for the BLOCKS algorithm to converge when fitting 
large multi-dimensional photon tables (100 million table cells and 
400 thousand spline coefficient). 



can easily construct a new set of basis functions that are 
the derivatives of B-splines: 



{x - h)Bl,_^{x) + Bl,_^{x) 



i+n+l 



x)B^:L\{x)-Blt\{x) 



(11) 



(12) 



We can unpack one of these recursions in order to write 
the derivatives as simple linear combinations of two lower- 
order B-splines: 



-B'„t\{x) 



H+n+1 



(13) 

The lower-order basis functions needed to form the deriva- 
tive basis can be obtained at almost no additional compu- 
tational cost by pausing the bottom-up calculation of B^^ 
pn Algorithm BSPLVB] at order n — 1 before proceeding 
to the final step. 

Because differentiation and addition commute, we can 
interchange these differential spline basis functions on the 
axis of interest with the original ones, while keeping all 
the other axes and the coefficient table unchanged. Keep- 
ing the coefficients unchanged also allows us to efficiently 
evaluate the spline surface and its gradient using SIMD op- 
erations. In our case, evaluating the spline surface and the 
6 elements of its gradient in sequence takes 7 times as long 
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(c) Average CPU utiUzation (d) Memory usage sampled 
every 30 seconds over 400 
hours of fitting. 

Figure 5: Resource consumption of penalized spline fits. Shown 
are two extremely large fitting jobs (100 million table cells and 400 
thousand coefficients) running in parallel on a 16-core, 2.3 GHz AMD 
Opteron system with 64 GB of memory using SuiteSparse ^ and 
GotoBLAS 2 1101 . The memory usage peak at 20 GB represents the 
memory needed for the initial factorization of the full A'^ A, while 
the broader, lower-memory peak represents the memory needed for 
the factorization of only the unconstrained parts of A. 



as a single evaluation, while the same computation imple- 
mented in terms of operations on 4-element basis vectors 
takes slightly less than twice as long. 

^.3. Analytic convolution of spline surfaces 

While scattering processes account for the vast major- 
ity of the time-delay distribution in detected photons, it 
can be important to account for detector effects as well 
when reconstructing events. In particular, the propaga- 
tion of current pulses through the photomultiplier tube 
and inter-DOM clock synchronization [2] introduce small 
timing uncertainties in the measured photon arrival time 
distribution. Also, while electromagnetic and hadronic 
showers are approximately point-like on the scale of the 
inter-string spacing in IceCube, they do have some spa- 
tial extent. Both of these effects can be approximately 
accounted for by convolving the photon delay-time distri- 
bution with a Gaussian. 

On a uniform knot field, a B-spline basis function of 
order n is proportional to the n-fold self-convolution of 
a constant function between two adjacent knots, and so 
the nth order uniform B-spline becomes proportional to a 
normal distribution as n — > cx). B-splines of relatively low 
order can be used to approximate Gaussian convolution 
kernels, albeit with limited support. Furthermore, it is 
possible to analytically convolve a spline surface with a 
B-spline kernel and expand the result in a new B-spline 
basis (Fig. |6|. These are useful properties, as they allow 
convolution of a spline surface by a pseudo-Gaussian kernel 
merely by manipulating its coefficients. 

The procedure for analytically convolving splines is de- 
scribed in 8J in a slightly different form than we require 
for our application, and we will review our adaptation of 
it here for the sake of clarity. As before, we will use i?^ 
to denote the ith B-spline of order m on a knot vector t 
with the usual de Boor normalization such that the set of 
splines on t sum to 1 everywhere. The convolution ker- 
nel will be represented by a spline on a knot vector 
r, but normalized so that each individual basis function 
integrates to 1. 

The convolution of B*,, and can be writterj^ in 
terms of dummy variables x and y as 



iBl*Mi){z) = {t,+„,+,-t,) 



n\{m + 1)! 



(to + n + 1)! 

X [ti , . . . , t'i+m+l]a: {'^j : ■ ■ • 7 '^j-\-n-\-l\y 

xix + y- (14) 

where [ti, . . . ,ti+k]x is the divided-difference operator 
with respect to x and (• — is the truncated power func- 
tion 



{a-b)l 



(a - 6)" b<a 
b> a 



(15) 



^This is a variation on Equation 13 of [8]. 
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Furthermore, B^^ ★ is a B-sphne surface of order 
f n+l on the combined knot field p= {U, . . . , U^m+i}^ 
^ Given this fact and the ability to eval- 
z) for arbitrary z, we can expand the 



{■ 

uate (B;, * Ml 
convolution in its B-spline coefficients. 

A B-spline surface of order m defined on a knot field p 
is a linear combination of B-splines: 

f{x)^Y.''^^rn,p (16) 

i 

The coefficients of the B-sphne expansion are given by 



•i+l, 



J Pi+m) 



(17) 



where € [pi,Pi+m) a-nd B is the multilinear blossom 
of / 17 . The multilinear blossom is a function that is 
linear in each of its m arguments and is identical to / when 
evaluated on its diagonal, that is, when all its arguments 
are identical. The blossom of 5^ ★ is given by 



B{{B'L* Ml){z)){pi+i, . . . ,p,+™+„+l) ^ (ti+m+l - U) 

nl{m + 1)! 



- [t^ , . . . , ^i-f 7n+l]a; {'^j : ■ • ■ : '^j+n+l\y 

i-\-m-\-n-\-l 



(m + n + 1)! 

(18) 



l=i+l 



where & is the Heaviside step function. 

In our application the convolution kernel is given by a 



single unit-normalized spline M^, and we can apply (17) 
to write the coefficients a of the B-spline expansion of the 
convolved surface in terms of the coefficients b of the un- 
convolved surface as 

a = Tb (19) 
where T is a matrix whose elements can be found by 



evaluating the blossom defined in ( 18 ) 



T,j = B((B^ ★ Af^)(p,))(p,+i, . . . ,pi+™+„+i) 



(20) 



This is particularly useful for tensor-product spline sur- 
faces: the convolution can be carried out along a given 
dimension by calculating the transformation matrix once 
and then applying it to each slice of the coefficient grid 
along that dimension. 



^The combined knot field is formed by m + 2 copies 
of {tj , . . . , rj4-„+i }, each centered on an element of 
{ti, . . . For example, {0,1,2} 1+) {-0.1,0,0.1} = 

{-0.1,0,0.1,0.9,1,1.1,1.9,2,2.1}. For a proof that B^i * 
is contained in the spline space of order m + n + 1, sec Theorem 8 

of El. 
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Figure 6: An example of spline convolution. Here, an order-1 spline 
surface on a non-uniform knot field is convolved with an order-1 
spline, resulting in an ordcr-3 spline surface. 



5. Conclusion 

Penalized splines provide a number of attractive prop- 
erties for handling detector-response simulations in high- 
energy physics experiments where alternative parametriza- 
tions are not practicable: fast and efficient deterministic 
fitting, even for very large datasets, integrated smooth- 
ing and extrapolation, and the ability to easily perform 
a variety of mathematical operations. The conventional 
penalized spline technique [1] was extended here with new 
tools and with existing algorithms O [151 US] for convolu- 
tion of the resulting spline tables and use with datasets 
typical in high-energy physics. Use of multi-dimensional 
tabulated Monte Carlo data in maximum-likelihood fits 
can be critical for many complicated situations such as de- 
tailed particle interaction reconstruction in non-segmented 
inhomogeneous detectors such as IceCube [S]. Most other 
tabulation strategies (multilinear interpolation, polynomi- 
als) quickly become intractable as the dimensionality of 
the problem increases due to numerical issues or compu- 
tational footprint, rendering problems that can be easily 
solved in principle impossible or extremely difficult in prac- 
tice. Use of spline surfaces, along with the ability to per- 
form on-the-fly differentiation, integration, and convolu- 
tion, solves many of these problems and greatly improves 
the ability to use such tables in maximum likelihood fits. 
Similar problems of detector response parametrization oc- 
cur across high-energy physics; use of penalized splines 
may help with related problems in many detectors. 
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